Technical Report No. BU-CE-1001: A Discretization Method Based on Maximizing the Area Under ROC Curve
Abstract
We present a new discretization method based on the Area Under the ROC Curve (AUC) measure. Maximum Area under ROC Curve Based Discretization (MAD) is a global, static, and supervised discretization method. It discretizes a continuous feature so that the AUC computed from that feature alone is maximized. We compare the proposed method with alternative discretization methods: Entropy-MDLP (Minimum Description Length Principle), which is known as one of the best discretization methods, Fixed Frequency Discretization (FFD), and Proportional Discretization (PD). FFD and PD were proposed recently and are designed for naïve Bayes learning. Evaluations on real-world datasets use the M-Measure, an AUC-based metric for multi-class classification, and the accuracy values obtained from the naïve Bayes and Aggregating One-Dependence Estimators (AODE) algorithms. Empirical results show that our method is a good candidate alternative to other discretization methods.
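The quantity the abstract refers to, the AUC obtained from a single feature once it has been discretized, can be illustrated with a minimal sketch. This is not the MAD algorithm itself, which the abstract does not specify. It assumes a two-class problem with labels 0 and 1, scores each interval by its positive-class rate (an illustrative choice), and uses hypothetical function names (`auc`, `discretize`, `auc_of_discretized`).

```python
from bisect import bisect_right


def auc(scores, labels):
    """AUC as the Mann-Whitney statistic: the fraction of (positive, negative)
    pairs ranked correctly, counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def discretize(values, cut_points):
    """Map each value to the index of its interval; cut_points must be sorted."""
    return [bisect_right(cut_points, v) for v in values]


def auc_of_discretized(values, labels, cut_points):
    """Assumption for illustration: score every interval by its positive-class
    rate, then compute the AUC of those interval scores."""
    bins = discretize(values, cut_points)
    totals, positives = {}, {}
    for b, y in zip(bins, labels):
        totals[b] = totals.get(b, 0) + 1
        positives[b] = positives.get(b, 0) + y
    scores = [positives[b] / totals[b] for b in bins]
    return auc(scores, labels)


if __name__ == "__main__":
    # Toy data, invented for this sketch only.
    values = [1, 2, 3, 4, 5, 6, 7, 8]
    labels = [0, 0, 1, 0, 1, 1, 0, 1]
    print(auc(values, labels))
    print(auc_of_discretized(values, labels, [2.5]))
    print(auc_of_discretized(values, labels, [2.5, 4.5, 7.5]))
```

A discretizer aiming to maximize AUC would search over candidate `cut_points` for the set that makes `auc_of_discretized` as large as possible. How MAD performs that search is described in the report, not in this sketch.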
Similar resources
A Discretization Method Based on Maximizing the Area under Receiver Operating Characteristic Curve
Many machine learning algorithms require the features to be categorical. Hence, they require all numeric-valued data to be discretized into intervals. In this paper, we present a new discretization method based on the area under the receiver operating characteristic (ROC) curve (AUC) measure. Maximum area under ROC curve-based discretization (MAD) is a global, static and supervised discretization method. MAD...
Risk Estimation by Maximizing the Area under ROC Curve
Risks exist in many different domains; medical diagnoses, financial markets, fraud detection and insurance policies are some examples. Various risk measures and risk estimation systems have hitherto been proposed and this paper suggests a new risk estimation method. Risk estimation by maximizing the area under a receiver operating characteristics (ROC) curve (REMARC) defines risk estimation as ...
Maximizing the Area under the ROC Curve using Incremental Reduced Error Pruning
The use of incremental reduced error pruning for maximizing the area under the ROC curve (AUC) instead of accuracy is investigated. A commonly used accuracy-based exclusion criterion is shown to include rules that result in concave ROC curves as well as to exclude rules that result in convex ROC curves. A previously proposed exclusion criterion for unordered rule sets, based on the lift, is on ...
Which combination of MR imaging modalities is best for predicting recurrent glioblastoma? Study of diagnostic accuracy and reproducibility.
PURPOSE To compare the added value of dynamic contrast material-enhanced (DCE) magnetic resonance (MR) imaging with that of dynamic susceptibility contrast-enhanced (DSC) MR imaging with the combination of contrast-enhanced (CE) T1-weighted imaging and diffusion-weighted (DW) imaging for predicting recurrent...
Score Fusion by Maximizing the Area under the ROC Curve
Information fusion is currently a very active research topic aimed at improving the performance of biometric systems. This paper proposes a novel method for optimizing the parameters of a score fusion model based on maximizing an index related to the Area Under the ROC Curve. This approach has the convenience that the fusion parameters are learned without having to specify the client and impost...